palette: make palette conversion thread safe

Conversion from RGBA u8 to an 8-bit palette format caches conversion
results in a hash table, belonging to the palette model. Currently,
manipulation of the hash table is not thread safe -- when multiple
threads convert to the same palette format concurrently, the result
may be wrong. In particular, there is a race condition when two
different colors that share the same hash are converted concurrently.

Fix this by changing the hash table layout, so that it can be
modified atomically. We assume that aligned 32-bit writes are
atomic.

Note that the new layout is only suitable for palettes with up to
256 colors, but that is all the hash table is currently used for.

Add a regression test.
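
For illustration only, here is a minimal sketch of one layout that can be
updated with a single aligned 32-bit write. It is not necessarily the
layout used by this change: the names (Palette, palette_init,
palette_lookup, nearest_color), the table size, and the bit packing are
all assumptions. Each slot packs the 24-bit RGB key and the 8-bit palette
index into one word, which is why such a scheme is limited to 256 colors.
The sketch uses C11 atomics with relaxed ordering instead of relying on
plain aligned writes being atomic.

```c
#include <stdatomic.h>
#include <stdint.h>

#define CACHE_SIZE 1111u  /* hypothetical number of slots */

typedef struct {
    const uint8_t *rgb;                  /* count * 3 bytes: R, G, B */
    unsigned count;                      /* at most 256 entries */
    _Atomic uint32_t cache[CACHE_SIZE];  /* index << 24 | 24-bit RGB key */
} Palette;

static void palette_init(Palette *p, const uint8_t *rgb, unsigned count)
{
    p->rgb = rgb;
    p->count = count;
    /* Slot i starts with key i + 1, which never hashes to slot i,
     * so every initial lookup is a miss without a separate flag. */
    for (uint32_t i = 0; i < CACHE_SIZE; i++)
        atomic_init(&p->cache[i], i + 1);
}

/* Uncached path: linear scan for the closest palette entry. */
static uint8_t nearest_color(const Palette *p, uint8_t r, uint8_t g, uint8_t b)
{
    unsigned best = 0;
    uint32_t best_dist = UINT32_MAX;
    for (unsigned i = 0; i < p->count; i++) {
        int dr = r - p->rgb[3 * i];
        int dg = g - p->rgb[3 * i + 1];
        int db = b - p->rgb[3 * i + 2];
        uint32_t dist = (uint32_t)(dr * dr + dg * dg + db * db);
        if (dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return (uint8_t)best;
}

static uint8_t palette_lookup(Palette *p, uint8_t r, uint8_t g, uint8_t b)
{
    uint32_t key = (uint32_t)r << 16 | (uint32_t)g << 8 | b;
    _Atomic uint32_t *slot = &p->cache[key % CACHE_SIZE];

    /* One load yields key and index together, so a reader can never
     * observe the key of one color paired with the index of another. */
    uint32_t entry = atomic_load_explicit(slot, memory_order_relaxed);
    if ((entry & 0x00FFFFFFu) == key)
        return (uint8_t)(entry >> 24);

    uint8_t index = nearest_color(p, r, g, b);
    /* One store publishes the whole entry; a racing writer with a
     * colliding color simply replaces it with another valid entry. */
    atomic_store_explicit(slot, (uint32_t)index << 24 | key,
                          memory_order_relaxed);
    return index;
}
```

With this packing, two colors that share a slot can only evict each
other; the worst case under contention is a cache miss, not a wrong
index.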